Amendment to the Specification:
Please amend the specification as follows. 
Please replace the pending title with the following new title: 
METHODS OF GENERATING VARIANTS OF AMIDASE-ENCODING NUCLEIC ACIDS 

Please replace paragraph [0001] on page 1 with the following amended paragraph:

This application is a continuation in part of and claims the benefit of U.S. 
Application Serial No. 09/609,570, filed June 30, 2000, now U.S. Patent No. 6,465,204, issued 
October 15, 2002 [[currently pending]]; which is a divisional of U.S. Application Serial No. 
09/427,372, filed October 25, 1999, now U.S. Patent No. 6,500,659, issued December 31, 2002 
[[currently pending]]; which is a divisional of U.S. Application Serial No. 09/261,006, filed 
March 2, 1999, now U.S. Patent No. 6,004,796; which is a [[continuation]] divisional of U.S. 
Application Serial No. 08/664,646, filed June 17, 1996, now U.S. Patent No. 5,877,001, all of 
which are herein incorporated by reference in their entirety. 

Please replace paragraph [0006] on page 2 with the following amended paragraph: 

In the present invention, it is shown that amidase is such an enzyme [[enzuyme]] 
and is useful for the removal of arginine, phenylalanine, or methionine amino acids from the N- 
terminal end of peptides in peptide or peptidomimetic synthesis. The enzyme is selective for the 
L, or "natural" enantiomer of the amino acid derivatives and is therefore useful for the 
production of optically active compounds. These reactions can be performed in the presence of 
the chemically more reactive ester functionality, a step which is very difficult to achieve with 
nonenzymatic methods. The enzyme is also able to tolerate high temperatures (at least 70° C), 
and high concentrations of organic solvents (>40% DMSO), both of which cause a disruption of 
secondary structure in peptides, which enables cleavage of otherwise resistant bonds. 

Please replace paragraph [0038] on pages 7 to 8 with the following amended paragraph: 

The term "polypeptide" as used herein, refers to amino acids joined to each other
by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may contain modified
amino acids other than the 20 gene-encoded amino acids. The polypeptides may be modified by
either natural processes, such as post-translational processing, or by chemical modification
techniques which are well known in the art. Modifications can occur anywhere in the
polypeptide, including the peptide backbone, the amino acid side-chains and the amino or 
carboxyl termini. It will be appreciated that the same type of modification may be present in the 
same or varying degrees at several sites in a given polypeptide. Also a given polypeptide may 
have many types of modifications. Modifications include acetylation, acylation, ADP- 
ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, 
covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or 
lipid derivative, covalent attachment of a phosphatidylinositol, cross-linking, cyclization,
disulfide bond formation, demethylation, formation of covalent cross-links, formation of
cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation
[[pergylation]], proteolytic processing, phosphorylation, prenylation, racemization, 
selenoylation, sulfation, and transfer-RNA mediated addition of amino acids to protein such as 
arginylation. (See Creighton, T.E., Proteins - Structure and Molecular Properties 2nd Ed., W.H.
Freeman and Company, New York (1993); Posttranslational Covalent Modification of Proteins, 
B.C. Johnson, Ed., Academic Press, New York, pp. 1-12 (1983)). 

Please replace paragraph [0048] on pages 11 to 12 with the following amended
paragraph: 



differs from a reference sequence by one or more conservative or non-conservative amino acid
substitutions, deletions, or insertions, particularly when such a substitution occurs at a site that is
not the active site of the molecule, and provided that the polypeptide essentially retains its
functional properties. A conservative amino acid substitution, for example, substitutes one
amino acid for another of the same class (e.g., substitution of one hydrophobic amino acid, such
as isoleucine [[isoleucin]], valine, leucine, or methionine, for another, or substitution of one polar
amino acid for another, such as substitution of arginine for lysine, glutamic acid for aspartic acid
or glutamine for asparagine). One or more amino acids can be deleted, for example, from an
amidase polypeptide, resulting in modification of the structure of the polypeptide, without
significantly altering its biological activity. For example, amino- or carboxyl-terminal amino
acids that are not required for amidase biological activity can be removed. Modified polypeptide
sequences of the invention can be assayed for amidase biological activity by any number of
methods, including contacting the modified polypeptide sequence with an amidase substrate and 
determining whether the modified polypeptide decreases the amount of specific substrate in the 
assay or increases the bioproducts of the enzymatic reaction of a functional amidase polypeptide 
with the substrate. 
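
By way of non-limiting illustration, the conservative substitution classes exemplified above can be expressed as a short Python sketch. The class groupings are taken only from the examples in this paragraph; the names CONSERVATIVE_CLASSES, is_conservative, and classify_substitutions are hypothetical, and the sketch assumes two ungapped sequences of equal length in one-letter code. It says nothing about whether a given variant retains amidase activity, which is determined by assay.

```python
# Illustrative classes drawn from the examples in the text (an assumption:
# these four groups are not an exhaustive classification scheme).
CONSERVATIVE_CLASSES = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine / lysine
    frozenset("ED"),    # glutamic acid / aspartic acid
    frozenset("QN"),    # glutamine / asparagine
]


def is_conservative(original, replacement):
    """Return True if both residues fall within one of the example classes."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in cls and replacement in cls
               for cls in CONSERVATIVE_CLASSES)


def classify_substitutions(reference, variant):
    """List (position, reference residue, variant residue, label) per difference."""
    if len(reference) != len(variant):
        raise ValueError("sketch assumes ungapped sequences of equal length")
    return [
        (i, r, v, "conservative" if is_conservative(r, v) else "non-conservative")
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]
```

For example, classify_substitutions("MKIE", "MRID") reports the lysine-to-arginine and glutamic-acid-to-aspartic-acid changes as conservative.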

Please replace paragraph [0052] on page 13 with the following amended paragraph:
The term "variant" refers to polynucleotides or polypeptides of the invention 
modified at one or more base pairs, codons, introns, exons, or amino acid residues (respectively) 
yet still retain the biological activity of an amidase of the invention. Variants can be produced 
by any number of means, including methods such as, for example, error-prone PCR, shuffling,
oligonucleotide-directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo 
mutagenesis, cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble 
mutagenesis, site-specific mutagenesis, gene reassembly, gene site saturated mutagenesis 
(GSSM™) [[GSSM]] and any combination thereof. 



Please replace paragraph [0112] on pages 33 to 34 with the following amended 
paragraph: 

Optionally, the method comprises the additional step of screening the library 
members of the shuffled pool to identify individual shuffled library members having the ability 
to bind or otherwise interact, or catalyze a particular reaction (e.g., such as catalytic domain of an 
enzyme) with a predetermined macromolecule, such as for example a proteinaceous receptor, an 
oligosaccharide, virion [[viron]], or other predetermined compound or structure. 

Please replace paragraph [0116] on page 35 with the following amended paragraph: 

The invention also provides for the use of proprietary codon primers (containing a 
degenerate N,N,N sequence) to introduce point mutations into a polynucleotide, so as to generate 
a set of progeny polypeptides in which a full range of single amino acid substitutions is 
represented at each amino acid position (gene site saturated mutagenesis (GSSM™)
[[(GSSM)]]). The oligos used are comprised contiguously of a first homologous sequence, a 
degenerate N,N,N sequence, and preferably but not necessarily a second homologous sequence. 
The downstream progeny translational products from the use of such oligos include all possible
amino acid changes at each amino acid site along the polypeptide, because the degeneracy of the 
N,N,N sequence includes codons for all 20 amino acids. 
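
By way of non-limiting illustration, the coverage afforded by a degenerate N,N,N sequence can be checked in silico. The following Python sketch enumerates all 64 N,N,N codons under the standard genetic code and substitutes each at a single codon position of a parent coding sequence. The function names saturate_codon and translate and the nine-base parent sequence are hypothetical; the sketch models only the codon substitution, not the flanking homologous sequences of the oligos or any laboratory step.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def saturate_codon(seq, codon_index):
    """Return the 64 variants of seq with one codon replaced by each N,N,N codon."""
    start = 3 * codon_index
    return [seq[:start] + codon + seq[start + 3:] for codon in CODON_TABLE]


if __name__ == "__main__":
    parent = "ATGGCTAAA"  # hypothetical parent sequence encoding M-A-K
    variants = saturate_codon(parent, 1)
    residues = {translate(v)[1] for v in variants}
    print(len(variants), "variants;", len(residues - {"*"}), "amino acids at position 2")
```

Because the 64 codons include the three stop codons, a subset of the progeny would encode truncated products in addition to the 20 amino acid substitutions.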

Please replace paragraph [0219] on pages 68 to 70 with the following amended 
paragraph: 

A "comparison window", as used herein, includes reference to a segment of any 
one of the number of contiguous positions selected from the group consisting of from 20 to 600, 
usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be 
compared to a reference sequence of the same number of contiguous positions after the two 
sequences are optimally aligned. Methods of alignment of sequence for comparison are well- 
known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the 
local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482, 1981, by the 
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443, 1970, by the
search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444, 1988, by
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in 
the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., 
Madison, WI), or by manual alignment and visual inspection. Other algorithms for determining 
homology or identity include, for example, in addition to a BLAST program (Basic Local 
Alignment Search Tool at the National Center for Biotechnology Information), ALIGN, AMAS
(Analysis of Multiply Aligned Sequences), AMPS (Protein Multiple Sequence Alignment), 
ASSET (Aligned Segment Statistical Evaluation Tool), BANDS, BESTSCOR, BIOSCAN 
(Biological Sequence Comparative Analysis Node), BLIMPS (BLocks IMProved Searcher), 
FASTA, Intervals & Points, BMB, CLUSTAL V, CLUSTAL W, CONSENSUS, 
LCONSENSUS, WCONSENSUS, Smith-Waterman algorithm, DARWIN, Las Vegas algorithm,
FNAT (Forced Nucleotide Alignment Tool), Framealign, Framesearch, DYNAMIC, FILTER, 
FSAP (Fristensky Sequence Analysis Package), GAP (Global Alignment Program), GENAL, 
GIBBS, GenQuest, ISSC (Sensitive Sequence Comparison), LALIGN (Local Sequence 
Alignment), LCP (Local Content Program), MACAW (Multiple Alignment Construction & 
Analysis Workbench), MAP (Multiple Alignment Program), MBLKP, MBLKN, PIMA (Pattern- 
Induced Multi-sequence Alignment), SAGA (Sequence Alignment by Genetic Algorithm) and 
WHAT-IF. Such alignment programs can also be used to screen genome databases to identify
polynucleotide sequences having substantially identical sequences. A number of genome 
databases are available, for example, a substantial portion of the human genome is available as 
part of the Human Genome Sequencing Project (J. Roach,
http://weber.u.Washington.edu/~roach/human_genome_progress2.html) (Gibbs, 1995). At
least twenty-one other genomes have already been sequenced, including, for example, M.
genitalium (Fraser et al., 1995), M. jannaschii (Bult et al., 1996), H. influenzae (Fleischmann et
al., 1995), E. coli (Blattner et al., 1997), yeast (S. cerevisiae) (Mewes et al., 1997), and D.
melanogaster (Adams et al., 2000). Significant progress has also been made in sequencing the
genomes of model organisms, such as mouse, C. elegans, and Arabidopsis sp. Several databases
containing genomic information annotated with some functional information are maintained by
different organizations, and are accessible via the internet, for example, http://www.tigr.org/tdb;
http://www.genetics.wisc.edu; http://genome-www.stanford.edu/~ball; http://hiv-web.lanl.gov;
http://www.ncbi.nlm.nih.gov; http://www.ebi.ac.uk; http://Pasteur.fr/other/biology; and
http://www.genome.wi.mit.edu.
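
By way of non-limiting illustration, the notion of a comparison window can be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned by one of the methods listed above and are supplied as equal-length strings with "-" marking gaps. The function name window_identity and the default window of 20 positions (the lower bound recited above) are illustrative choices only, and the sketch is not a substitute for any of the named alignment programs.

```python
def window_identity(aligned_a, aligned_b, window=20):
    """Fraction of identical positions in each comparison window.

    Both inputs are assumed to be pre-aligned strings of equal length,
    with "-" marking alignment gaps. Returns (start, identity) pairs for
    every window of `window` contiguous positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(len(aligned_a) - window + 1):
        seg_a = aligned_a[start:start + window]
        seg_b = aligned_b[start:start + window]
        # A gap aligned to a gap is not counted as an identity.
        matches = sum(x == y and x != "-" for x, y in zip(seg_a, seg_b))
        results.append((start, matches / window))
    return results
```

A window longer than the alignment yields an empty list, so the caller is expected to choose a window no longer than the aligned region.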

Please replace paragraph [0220] on pages 70 to 71 with the following amended 
paragraph: 

One example of a useful algorithm is BLAST and BLAST 2.0 algorithms, which
are described in Altschul et al., Nuc. Acids Res. 25:3389-3402, 1997, and Altschul et al., J. Mol.
Biol. 215:403-410, 1990, respectively. Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology Information
[[(http://www.ncbi.nlm.nih.gov/)]]. This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when aligned with a word of the
same length in a database sequence. T is referred to as the neighborhood word score threshold
(Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches
to find longer HSPs containing them. The word hits are extended in both directions along each
sequence for as far as the cumulative alignment score can be increased. Cumulative scores are
calculated using, for nucleotide sequences, the parameters M (reward score for a pair of
matching residues; always >0). For amino acid sequences, a scoring matrix is used to calculate
the cumulative score. Extension of the word hits in each direction is halted when: the
cumulative alignment score falls off by the quantity X from its maximum achieved value; the 
cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring 
residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters 
W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for 
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5,
N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses
as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992) alignments (B) of 50,
expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
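
By way of non-limiting illustration, the extension of a word hit and the halting conditions described above can be sketched in Python for nucleotide sequences. The sketch is a simplified, ungapped extension in one direction only and is not the BLAST implementation. The function name extend_right and its parameter names are hypothetical; the reward of 5 and penalty of -4 follow the M and N defaults recited above, while the drop-off value of 20 for X is an arbitrary assumption. The seed word is assumed to match exactly.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts positions
    from the start of the seed. Extension halts when the running score
    falls x_drop or more below its maximum, when the running score goes
    to zero or below, or when the end of either sequence is reached.
    """
    score = word_len * match      # seed assumed to be an exact match
    best_score, best_len = score, word_len
    offset = word_len
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        if score > best_score:
            best_score, best_len = score, offset + 1
        elif best_score - score >= x_drop or score <= 0:
            break
        offset += 1
    return best_score, best_len
```

A full extension would apply the same procedure to the left of the seed and combine the two results. A smaller drop-off halts earlier and runs faster at some cost in sensitivity, consistent with the statement that W, T, and X determine the sensitivity and speed of the alignment.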

Please replace paragraph [0223] on page 72 with the following amended paragraph: 
The BLAST programs identify homologous sequences by identifying similar 
segments, which are referred to herein as "high-scoring segment pairs," between a query amino 
or nucleic acid sequence and a test sequence which is preferably obtained from a protein or 
nucleic acid sequence database. High-scoring segment pairs are preferably identified (i.e., 
aligned) by means of a scoring matrix, many of which are known in the art. Preferably, the 
scoring matrix used is the BLOSUM62 matrix (Gonnet et al., Science 256:1443-1445, 1992;
Henikoff and Henikoff, Proteins 17:49-61, 1993). Less preferably, the PAM or PAM250 
matrices may also be used (see, e.g., Schwartz and Dayhoff, eds., 1978, Matrices for Detecting 
Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National 
Biomedical Research Foundation). BLAST programs are accessible through the U.S. National 
Library of Medicine[[, e.g., at www.ncbi.nlm.nih.gov]].

Please replace paragraph [0240] on pages 77 to 78 with the following amended 
paragraph: 



300 for detecting the presence of a feature in a sequence. The process 300 begins at a start state
302 and then moves to a state 304 wherein a first sequence that is to be checked for features is
stored to a memory 115 in the computer system 100. The process 300 then moves to a state 306
wherein a database of sequence features is opened. Such a database would include a list of each
feature's attributes along with the name of the feature. For example, a feature name could be
"Initiation Codon" and the attribute would be "ATG". Another example would be the feature 
name "TAATAA Box" and the feature attribute would be "TAATAA". An example of such a 
database is produced by the University of Wisconsin Genetics Computer Group 
[[(www.gcg.com)]]. Alternatively, the features may be structural polypeptide motifs such as
alpha helices, beta sheets, or functional polypeptide motifs such as enzymatic active sites, helix- 
turn-helix motifs or other motifs known to those skilled in the art. 
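
By way of non-limiting illustration, the comparison of a stored sequence against a database of feature names and attributes can be sketched in Python. The two entries in FEATURES are the examples given above; the dictionary stands in for the feature database, and the function name find_features is hypothetical. The sketch matches literal attribute strings only and does not address structural or functional polypeptide motifs.

```python
# Stand-in for the feature database: feature name -> feature attribute.
FEATURES = {
    "Initiation Codon": "ATG",
    "TAATAA Box": "TAATAA",
}


def find_features(sequence, features=FEATURES):
    """Return (feature name, start position) for every occurrence found.

    Positions are zero-based. Overlapping occurrences are reported
    because the search resumes one position after each hit.
    """
    hits = []
    for name, attribute in features.items():
        start = sequence.find(attribute)
        while start != -1:
            hits.append((name, start))
            start = sequence.find(attribute, start + 1)
    return hits
```

For example, find_features("CCATGTAATAAGG") reports the initiation codon at position 2 and the TAATAA box at position 5.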